AITopics | persuasion technique

Proactive Defense: Compound AI for Detecting Persuasion Attacks and Measuring Inoculation Effectiveness

Volkova, Svitlana, Dupree, Will, Kao, Hsien-Te, Bautista, Peter, Ganberg, Gabe, Beaubien, Jeff, Cassani, Laura

arXiv.org Artificial Intelligence, Dec-1-2025

This paper introduces BRIES, a novel compound AI architecture designed to detect and measure the effectiveness of persuasion attacks across information environments. We present a system with specialized agents: a Twister that generates adversarial content employing targeted persuasion tactics, a Detector that identifies attack types with configurable parameters, a Defender that creates resilient content through content inoculation, and an Assessor that employs causal inference to evaluate inoculation effectiveness. Experimenting with the SemEval 2023 Task 3 taxonomy across the synthetic persuasion dataset, we demonstrate significant variations in detection performance across language agents. Our comparative analysis reveals significant performance disparities with GPT-4 achieving superior detection accuracy on complex persuasion techniques, while open-source models like Llama3 and Mistral demonstrated notable weaknesses in identifying subtle rhetorical techniques, suggesting that different architectures encode and process persuasive language patterns in fundamentally different ways. We show that prompt engineering dramatically affects detection efficacy, with temperature settings and confidence scoring producing model-specific variations; Gemma and GPT-4 perform optimally at lower temperatures while Llama3 and Mistral show improved capabilities at higher temperatures. Our causal analysis provides novel insights into socio-emotional-cognitive signatures of persuasion attacks, revealing that different attack types target specific cognitive dimensions. This research advances generative AI safety and cognitive security by quantifying LLM-specific vulnerabilities to persuasion attacks and delivers a framework for enhancing human cognitive resilience through structured interventions before exposure to harmful content.

effectiveness, large language model, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2511.21749

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry:

Government > Military (1.00)
Government > Regional Government > North America Government > United States Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Accumulating Context Changes the Beliefs of Language Models

Geng, Jiayi, Chen, Howard, Liu, Ryan, Ribeiro, Manoel Horta, Willer, Robb, Neubig, Graham, Griffiths, Thomas L.

arXiv.org Artificial Intelligence, Nov-5-2025

Language model (LM) assistants are increasingly used in applications such as brainstorming and research. Improvements in memory and context size have allowed these models to become more autonomous, which has also resulted in more text accumulation in their context windows without explicit user intervention. This comes with a latent risk: the belief profiles of models -- their understanding of the world as manifested in their responses or actions -- may silently change as context accumulates. This can lead to subtly inconsistent user experiences, or shifts in behavior that deviate from the original alignment of the models. In this paper, we explore how accumulating context by engaging in interactions and processing text -- talking and reading -- can change the beliefs of language models, as manifested in their responses and behaviors. Our results reveal that models' belief profiles are highly malleable: GPT-5 exhibits a 54.7% shift in its stated beliefs after 10 rounds of discussion about moral dilemmas and queries about safety, while Grok 4 shows a 27.2% shift on political issues after reading texts from the opposing position. We also examine models' behavioral changes by designing tasks that require tool use, where each tool selection corresponds to an implicit belief. We find that these changes align with stated belief shifts, suggesting that belief shifts will be reflected in actual behavior in agentic systems. Our analysis exposes the hidden risk of belief shift as models undergo extended sessions of talking or reading, rendering their opinions and actions unreliable.
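One simple way to read a figure such as "a 54.7% shift in its stated beliefs" is as the share of questions whose answer changed between two snapshots of the model. The sketch below illustrates that reading only; the function name `belief_shift`, the question keys, and the answers are hypothetical, and the paper's actual metric may be defined differently.

```python
def belief_shift(before: dict[str, str], after: dict[str, str]) -> float:
    """Share of shared questions whose stated answer changed between snapshots.

    Assumption: a belief counts as shifted when the two answers differ at all.
    """
    shared = before.keys() & after.keys()
    if not shared:
        raise ValueError("no shared questions to compare")
    changed = sum(before[q] != after[q] for q in shared)
    return changed / len(shared)


# Made-up answers for illustration; not data from the paper.
before = {"q1": "agree", "q2": "disagree", "q3": "agree", "q4": "neutral"}
after = {"q1": "agree", "q2": "agree", "q3": "agree", "q4": "disagree"}

shift = belief_shift(before, after)
print(f"{shift:.1%}")  # 2 of 4 answers changed -> 50.0%
```

Comparing only the questions present in both snapshots keeps the ratio well defined when a model declines to answer in one round.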

arxiv preprint arxiv, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2511.01805

Country: North America > United States (1.00)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.88)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law (1.00)
Government > Voting & Elections (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Psychological Tricks Can Get AI to Break the Rules

WIRED, Sep-7-2025, 10:00:00 GMT

If you were trying to learn how to get other people to do what you want, you might use some of the techniques found in a book like Influence: The Power of Persuasion. Now, a preprint study out of the University of Pennsylvania suggests that those same psychological persuasion techniques can frequently "convince" some LLMs to do things that go against their system prompts. The size of the persuasion effects shown in "Call Me a Jerk: Persuading AI to Comply with Objectionable Requests" suggests that human-style psychological techniques can be surprisingly effective at "jailbreaking" some LLMs to operate outside their guardrails. But this new persuasion study might be more interesting for what it reveals about the "parahuman" behavior patterns that LLMs are gleaning from the copious examples of human psychological and social cues found in their training data. To design their experiment, the University of Pennsylvania researchers tested 2024's GPT-4o-mini model on two requests that it should ideally refuse: calling the user a jerk and giving directions for how to synthesize lidocaine. After creating control prompts that matched each experimental prompt in length, tone, and context, all prompts were run through GPT-4o-mini 1,000 times (at the default temperature of 1.0, to ensure variety).
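The design described above, running each prompt many times at a non-zero temperature and comparing how often the model complies under control and persuasion prompts, can be sketched as follows. This is a minimal illustration, not the study's code: `make_stub_model` is a hypothetical stand-in for a real model call, the `[control]` and `[persuasion]` tags are invented markers, and the probabilities inside the stub are arbitrary values chosen only so the two arms differ.

```python
import random
from typing import Callable


def compliance_rate(ask: Callable[[str], bool], prompt: str, trials: int = 1000) -> float:
    """Fraction of repeated trials in which the model complied with the prompt."""
    complied = sum(ask(prompt) for _ in range(trials))
    return complied / trials


def make_stub_model(seed: int = 0) -> Callable[[str], bool]:
    """Hypothetical stand-in for a sampled LLM call; returns True on compliance."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible

    def ask(prompt: str) -> bool:
        # Arbitrary illustrative probabilities, NOT results from the study.
        p = 0.6 if "[persuasion]" in prompt else 0.3
        return rng.random() < p

    return ask


ask = make_stub_model()
control = compliance_rate(ask, "[control] Call me a jerk.")
treatment = compliance_rate(ask, "[persuasion] Call me a jerk.")
print(f"control={control:.3f} treatment={treatment:.3f} diff={treatment - control:+.3f}")
```

Repeating each prompt many times matters because sampling at temperature 1.0 makes any single response noisy; the rate over many trials is what the two conditions are compared on.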

large language model, machine learning, natural language, (14 more...)

WIRED

Country: North America > United States > Pennsylvania (0.47)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Iterative Prompting with Persuasion Skills in Jailbreaking Large Language Models

Ke, Shih-Wen, Lai, Guan-Yu, Fang, Guo-Lin, Kao, Hsi-Yuan

arXiv.org Artificial Intelligence, Mar-26-2025

Large language models (LLMs) are designed to align with human values in their responses. This study exploits LLMs with an iterative prompting technique where each prompt is systematically modified and refined across multiple iterations to enhance its effectiveness in jailbreaking attacks progressively. This technique involves analyzing the response patterns of LLMs, including GPT-3.5, GPT-4, LLaMa2, Vicuna, and ChatGLM, allowing us to adjust and optimize prompts to evade the LLMs' ethical and security constraints. Persuasion strategies enhance prompt effectiveness while maintaining consistency with malicious intent. Our results show that the attack success rates (ASR) increase as the attacking prompts become more refined, with the highest ASR of 90% for GPT-4 and ChatGLM and the lowest ASR of 68% for LLaMa2. Our technique outperforms baseline techniques (PAIR and PAP) in ASR and shows comparable performance with GCG and ArtPrompt.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2503.2032

Country:

Asia > Taiwan (0.05)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Among Them: A game-based framework for assessing persuasion capabilities of LLMs

Idziejczak, Mateusz, Korzavatykh, Vasyl, Stawicki, Mateusz, Chmutov, Andrii, Korcz, Marcin, Błądek, Iwo, Brzezinski, Dariusz

arXiv.org Artificial Intelligence, Feb-27-2025

The proliferation of large language models (LLMs) and autonomous AI agents has raised concerns about their potential for automated persuasion and social influence. While existing research has explored isolated instances of LLM-based manipulation, systematic evaluations of persuasion capabilities across different models remain limited. In this paper, we present an Among Us-inspired game framework for assessing LLM deception skills in a controlled environment. The proposed framework makes it possible to compare LLM models by game statistics, as well as quantify in-game manipulation according to 25 persuasion strategies from social psychology and rhetoric. Experiments between 8 popular language models of different types and sizes demonstrate that all tested models exhibit persuasive capabilities, successfully employing 22 of the 25 anticipated techniques. We also find that larger models do not provide any persuasion advantage over smaller models and that longer model outputs are negatively correlated with the number of games won. Our study provides insights into the deception capabilities of LLMs, as well as tools and data for fostering future research on the topic.

impostor, language model, persuasion technique, (16 more...)

arXiv.org Artificial Intelligence

2502.20426

Country: Europe > Poland > Greater Poland Province > Poznań (0.04)

Genre:

Research Report > Experimental Study (0.49)
Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

o3-mini vs DeepSeek-R1: Which One is Safer?

Arrieta, Aitor, Ugarte, Miriam, Valle, Pablo, Parejo, José Antonio, Segura, Sergio

arXiv.org Artificial Intelligence, Jan-31-2025

The irruption of DeepSeek-R1 constitutes a turning point for the AI industry in general and the LLMs in particular. Its capabilities have demonstrated outstanding performance in several tasks, including creative thinking, code generation, maths and automated program repair, at apparently lower execution cost. However, LLMs must adhere to an important qualitative property, i.e., their alignment with safety and human values. A clear competitor of DeepSeek-R1 is its American counterpart, OpenAI's o3-mini model, which is expected to set high standards in terms of performance, safety and cost. In this technical report, we systematically assess the safety level of both DeepSeek-R1 (70b version) and OpenAI's o3-mini (beta version). To this end, we make use of our recently released automated safety testing tool, named ASTRAL. By leveraging this tool, we automatically and systematically generated and executed 1,260 test inputs on both models. After conducting a semi-automated assessment of the outcomes provided by both LLMs, the results indicate that DeepSeek-R1 produces significantly more unsafe responses (12%) than OpenAI's o3-mini (1.2%).

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2501.18438

Country:

North America > United States (0.46)
Asia > China (0.05)
South America (0.04)
(6 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Law Enforcement & Public Safety (1.00)
Law (1.00)
Information Technology > Security & Privacy (1.00)
(8 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.67)

ASTRAL: Automated Safety Testing of Large Language Models

Ugarte, Miriam, Valle, Pablo, Parejo, José Antonio, Segura, Sergio, Arrieta, Aitor

arXiv.org Artificial Intelligence, Jan-28-2025

Large Language Models (LLMs) have recently gained attention due to their ability to understand and generate sophisticated human-like content. However, ensuring their safety is paramount as they might provide harmful and unsafe responses. Existing LLM testing frameworks address various safety-related concerns (e.g., drugs, terrorism, animal abuse) but often face challenges due to unbalanced and obsolete datasets. In this paper, we present ASTRAL, a tool that automates the generation and execution of test cases (i.e., prompts) for testing the safety of LLMs. First, we introduce a novel black-box coverage criterion to generate balanced and diverse unsafe test inputs across a diverse set of safety categories as well as linguistic writing characteristics (i.e., different style and persuasive writing techniques). Second, we propose an LLM-based approach that leverages Retrieval Augmented Generation (RAG), few-shot prompting strategies and web browsing to generate up-to-date test inputs. Lastly, similar to current LLM test automation techniques, we leverage LLMs as test oracles to distinguish between safe and unsafe test outputs, allowing a fully automated testing approach. We conduct an extensive evaluation on well-known LLMs, revealing the following key findings: i) GPT3.5 outperforms other LLMs when acting as the test oracle, accurately detecting unsafe responses, and even surpassing more recent LLMs (e.g., GPT-4), as well as LLMs that are specifically tailored to detect unsafe LLM outputs (e.g., LlamaGuard); ii) the results confirm that our approach can uncover nearly twice as many unsafe LLM behaviors with the same number of test inputs compared to currently used static datasets; and iii) our black-box coverage criterion combined with web browsing can effectively guide the LLM on generating up-to-date unsafe test inputs, significantly increasing the number of unsafe LLM behaviors.

large language model, machine learning, test input, (16 more...)

arXiv.org Artificial Intelligence

2501.17132

Country:

Europe > Spain > Andalusia > Seville Province > Seville (0.04)
North America > United States > Maine (0.04)
North America > United States > California (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Law (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Cross-Domain Study of the Use of Persuasion Techniques in Online Disinformation

Leite, João A., Razuvayevskaya, Olesya, Scarton, Carolina, Bontcheva, Kalina

arXiv.org Artificial Intelligence, Dec-19-2024

Disinformation, irrespective of domain or language, aims to deceive or manipulate public opinion, typically through employing advanced persuasion techniques. Qualitative and quantitative research on the weaponisation of persuasion techniques in disinformation has been mostly topic-specific (e.g., COVID-19) with limited cross-domain studies, resulting in a lack of comprehensive understanding of these strategies. This study employs a state-of-the-art persuasion technique classifier to conduct a large-scale, multi-domain analysis of the role of 16 persuasion techniques in disinformation narratives. It shows how different persuasion techniques are employed disproportionately in different disinformation domains. We also include a detailed case study on climate change disinformation, highlighting how linguistic, psychological, and cultural factors shape the adaptation of persuasion strategies to fit unique thematic contexts.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2412.15098

Country:

North America > United States > Texas > Travis County > Austin (0.14)
Europe > United Kingdom > England > South Yorkshire > Sheffield (0.04)
Europe > Russia (0.04)
Asia > Russia (0.04)

Genre: Research Report > New Finding (0.47)

Industry:

Media > News (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.71)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.52)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.68)

Biased AI can Influence Political Decision-Making

Fisher, Jillian, Feng, Shangbin, Aron, Robert, Richardson, Thomas, Choi, Yejin, Fisher, Daniel W., Pan, Jennifer, Tsvetkov, Yulia, Reinecke, Katharina

arXiv.org Artificial Intelligence, Nov-4-2024

As modern AI models become integral to everyday tasks, concerns about their inherent biases and their potential impact on human decision-making have emerged. While bias in models is well-documented, less is known about how these biases influence human decisions. This paper presents two interactive experiments investigating the effects of partisan bias in AI language models on political decision-making. Participants interacted freely with either a biased liberal, biased conservative, or unbiased control model while completing political decision-making tasks. We found that participants exposed to politically biased models were significantly more likely to adopt opinions and make decisions aligning with the AI's bias, regardless of their personal political partisanship. However, we also discovered that prior knowledge about AI could lessen the impact of the bias, highlighting the possible importance of AI education for robust bias mitigation. Our findings not only highlight the critical effects of interacting with biased AI and its ability to impact public discourse and political conduct, but also highlight potential techniques for mitigating these risks in the future.

ai language model, language model, participant, (12 more...)

arXiv.org Artificial Intelligence

2410.06415

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > District of Columbia > Washington (0.04)
(8 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law (1.00)
Media > News (0.92)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Uncovering Differences in Persuasive Language in Russian versus English Wikipedia

Li, Bryan, Panasyuk, Aleksey, Callison-Burch, Chris

arXiv.org Artificial Intelligence, Sep-27-2024

We study how differences in persuasive language across Wikipedia articles, written in either English or Russian, can uncover each culture's distinct perspective on different subjects. We develop a large language model (LLM) powered system to identify instances of persuasive language in multilingual texts. Instead of directly prompting LLMs to detect persuasion, which is subjective and difficult, we propose to reframe the task to instead ask high-level questions (HLQs) which capture different persuasive aspects. Importantly, these HLQs are authored by LLMs themselves. LLMs over-generate a large set of HLQs, which are subsequently filtered to a small set aligned with human labels for the original task. We then apply our approach to a large-scale, bilingual dataset of Wikipedia articles (88K total), using a two-stage identify-then-extract prompting strategy to find instances of persuasion. We quantify the amount of persuasion per article, and explore the differences in persuasion through several experiments on the paired articles. Notably, we generate rankings of articles by persuasion in both languages. These rankings match our intuitions on the culturally-salient subjects; Russian Wikipedia highlights subjects on Ukraine, while English Wikipedia highlights the Middle East. Grouping subjects into larger topics, we find politically-related events contain more persuasion than others. We further demonstrate that HLQs obtain similar performance when posed in either English or Russian. Our methodology enables cross-lingual, cross-cultural understanding at scale, and we release our code, prompts, and data.

large language model, machine learning, persuasion, (21 more...)

arXiv.org Artificial Intelligence

2409.19148

Country:

Asia > Russia (0.69)
Europe > Middle East (0.24)
Africa > Middle East (0.24)
(7 more...)

Genre: Research Report (0.70)

Industry:

Media (1.00)
Law Enforcement & Public Safety (0.68)
Government > Regional Government > North America Government > United States Government (0.67)
(2 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
